Channel: Learn By Watch
Category: Education
Tags: gradient descent machine learning, normal equations, gradient descent algorithm, optimization algorithms, neural networks, partial derivatives machine learning, what is gradient descent, gradient descent math, machine learning tutorial, linear regression channel trading strategy, cost function, feature scaling in machine learning, machine learning algorithms, machine learning for beginners, gradient descent, machine learning full course, polynomial regression
Description: Gradient descent is an optimization algorithm used when training a machine learning model. Starting from an initial guess, it tweaks the model's parameters iteratively to drive the cost function down toward a (local) minimum. In this video you will learn about these topics:
● Recap - Quick recap of the hypothesis and cost function, and the aim of reducing the cost function as much as we can.
● Outline of gradient descent - The steps the algorithm follows.
● Update theta - The formula for updating theta.
● Alpha (learning rate parameter) - How the value of alpha affects the training of the model.
● Cost function graph
● Partial derivatives - The formula for calculating the partial derivatives of the cost function.
● Algorithm - The gradient descent algorithm with its formula (a minimal code sketch follows this description).
● Predicting the output of a new dataset - How we can use the trained model to predict the outcome of new data in our test set.
● Multivariable linear regression - How to handle problems with more than one input feature using linear regression.
● Algorithm for multivariable linear regression
● Feature scaling - The formula for feature scaling, and why scaling the dataset is important before feeding it to the model for training.
● Polynomial regression - Not every problem can be handled by fitting a straight line, so we use polynomial regression, explained by updating some terms in linear regression (see the second sketch below).
● Normal equation - A one-line formula to train the model. It is not preferred on larger datasets because it takes much more time than gradient descent (see the last sketch below).
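A minimal NumPy sketch of the topics above (the theta update rule, feature scaling, and predicting on new data) follows. This is not the code from the video: the dataset, the variable names (X_raw, y, alpha, num_iters), and the helpers scale_features, cost, and gradient_descent are all hypothetical, chosen only to illustrate batch gradient descent for multivariable linear regression.

```python
import numpy as np

def scale_features(X):
    """Feature scaling: (x - mean) / std, applied column-wise."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / sigma, mu, sigma

def cost(X, y, theta):
    """Mean squared error cost: J(theta) = (1/2m) * sum((X @ theta - y)^2)."""
    m = len(y)
    residuals = X @ theta - y
    return (residuals @ residuals) / (2 * m)

def gradient_descent(X, y, alpha=0.1, num_iters=1000):
    """Batch gradient descent: theta_j := theta_j - alpha * dJ/dtheta_j."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(num_iters):
        gradient = (X.T @ (X @ theta - y)) / m   # partial derivatives of J
        theta -= alpha * gradient
    return theta

# Tiny synthetic example (hypothetical data, two input features).
X_raw = np.array([[2100.0, 3], [1600.0, 3], [2400.0, 4], [1400.0, 2]])
y = np.array([400.0, 330.0, 369.0, 232.0])

X_scaled, mu, sigma = scale_features(X_raw)
X = np.hstack([np.ones((len(y), 1)), X_scaled])  # add intercept column (theta_0)

theta = gradient_descent(X, y, alpha=0.1, num_iters=2000)
print("learned theta:", theta, "final cost:", cost(X, y, theta))

# Predicting on a new example: scale it with the *training* mu/sigma first.
x_new = (np.array([2000.0, 3]) - mu) / sigma
print("prediction:", np.r_[1.0, x_new] @ theta)
```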
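For the polynomial regression segment, the same idea can be sketched by adding an x² column to the design matrix. The toy data below is made up for illustration, and np.linalg.lstsq is used instead of gradient descent purely for brevity.

```python
import numpy as np

# Polynomial regression as "linear regression with extra columns": to fit
# h(x) = theta_0 + theta_1*x + theta_2*x^2, add an x^2 feature and solve as usual.
# Hypothetical noise-free data generated from y = 1 + 2x + 3x^2.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 1 + 2 * x + 3 * x ** 2

X = np.column_stack([np.ones_like(x), x, x ** 2])   # columns [1, x, x^2]
theta, *_ = np.linalg.lstsq(X, y, rcond=None)       # least-squares fit
print("recovered coefficients:", theta)             # approximately [1.0, 2.0, 3.0]
```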
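Finally, a hedged sketch of the normal equation theta = (XᵀX)⁻¹ Xᵀ y mentioned in the last topic. The toy data is invented, and np.linalg.pinv is used so the snippet still runs if XᵀX happens to be singular.

```python
import numpy as np

# Normal equation: a closed-form fit with no iteration and no learning rate.
# Hypothetical toy data: one feature plus an intercept column of ones.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([2.0, 2.9, 4.1, 5.0])

theta = np.linalg.pinv(X.T @ X) @ X.T @ y
print("theta from the normal equation:", theta)  # roughly [1.0, 1.0]
```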